Overview

Dataset statistics

Number of variables 15
Number of observations 621762
Missing cells 767678
Missing cells (%) 8.2%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 71.2 MiB
Average record size in memory 120.0 B
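
The derived figures above follow from the raw counts. As a rough check (a sketch that uses only numbers printed in this report; the helper name `pct` is hypothetical), the missing-cell share is the missing-cell count divided by variables × observations, and the average record size is the total memory divided by the number of observations:

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 15
n_observations = 621_762
missing_cells = 767_678
total_mib = 71.2  # rounded in the report, so derived values are approximate


def pct(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


total_cells = n_variables * n_observations
missing_pct = pct(missing_cells, total_cells)
avg_record_bytes = total_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

The missing-cell total is also exactly twice 383839, the per-column missing count that the alerts report for odds_ratio and upr, so those two columns appear to account for every missing cell.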

Variable types

Categorical 4
Text 2
Numeric 9

Alerts

a is highly overall correlated with category and 2 other fields High correlation
b is highly overall correlated with nichd and 4 other fields High correlation
c is highly overall correlated with category and 2 other fields High correlation
d is highly overall correlated with category and 4 other fields High correlation
lwr is highly overall correlated with odds_ratio and 1 other field High correlation
odds_ratio is highly overall correlated with lwr and 1 other field High correlation
upr is highly overall correlated with lwr and 1 other field High correlation
pvalue is highly overall correlated with nichd and 3 other fields High correlation
fdr is highly overall correlated with pvalue High correlation
category is highly overall correlated with atc_concept_class_id and 1 other field High correlation
nichd is highly overall correlated with b and 2 other fields High correlation
atc_concept_class_id is highly overall correlated with category High correlation
meddra_concept_class_id is highly overall correlated with category High correlation
odds_ratio has 383839 (61.7%) missing values Missing
upr has 383839 (61.7%) missing values Missing
a is highly skewed (γ1 = 48.91934858) Skewed
c is highly skewed (γ1 = 46.18078914) Skewed
c has 383839 (61.7%) zeros Zeros
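
The γ1 values in the Skewed alerts are skewness coefficients. A minimal sketch of the moment-based definition, γ1 = m3 / m2^(3/2), is given here with a hypothetical helper named `skewness`. The profiler may apply a bias-corrected estimator, so results on small samples need not match its output exactly.

```python
from statistics import fmean


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Toy data (not taken from the report): a long right tail gives a positive value,
# symmetric data gives zero.
right_tailed = [1, 1, 1, 10]
symmetric = [1, 2, 3, 4, 5]
print(skewness(right_tailed))
print(skewness(symmetric))
```

Variables such as a and c, where most rows hold very small counts and a few rows hold values in the thousands, produce large positive coefficients of this kind.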

Reproduction

Analysis started 2025-04-28 13:34:34.232823
Analysis finished 2025-04-28 13:35:10.852315
Duration 36.62 seconds
Software version ydata-profiling v4.16.1
Download configuration config.json
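
A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions: the input is taken to be a CSV file, and the function name, file names, and report title are placeholders that do not come from this report.

```python
def build_profile(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the HTML report (paths are placeholders)."""
    # Imported inside the function so this module still loads when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling Report")  # title is an assumption
    profile.to_file(html_path)


# Example call with hypothetical file names:
# build_profile("data.csv", "report.html")
```

Run time depends on the data size and on which correlation and interaction plots are enabled; the run recorded above took 36.62 seconds for 621762 observations.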

Variables

category
Categorical

High correlation 

Distinct 28
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.7 MiB
hlt_atc5
48305 
hlt_atc4
46083 
hlgt_atc5
44660 
hlt_atc3
42357 
hlgt_atc4
39421
Other values (23)
400936 

Length

Max length 9
Median length 8
Mean length 7.6795189
Min length 2

Characters and Unicode

Total characters 4774833
Distinct characters 15
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row soc
2nd row soc
3rd row soc
4th row soc
5th row soc

Common Values

Value Count Frequency (%)
hlt_atc5 48305 7.8%
hlt_atc4 46083 7.4%
hlgt_atc5 44660 7.2%
hlt_atc3 42357 6.8%
hlgt_atc4 39421 6.3%
hlt_atc2 38824 6.2%
pt_atc4 34716 5.6%
pt_atc3 33661 5.4%
soc_atc5 33634 5.4%
hlgt_atc3 33007 5.3%
Other values (18) 227094 36.5%

Length

Histogram of lengths of the category

Most occurring characters

Value Count Frequency (%)
t 1117990 23.4%
c 670820 14.0%
a 588243 12.3%
_ 578805 12.1%
h 374415 7.8%
l 374415 7.8%
g 161416 3.4%
p 155332 3.3%
4 144860 3.0%
5 132031 2.8%
Other values (5) 476506 10.0%

Most occurring categories

Value Count Frequency (%)
(unknown) 4774833 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 4774833 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 4774833 100.0%

(variable name missing in source)
Text

Distinct 1520
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory size 4.7 MiB

Length

Max length 92
Median length 61
Mean length 22.538116
Min length 3

Characters and Unicode

Total characters 14013344
Distinct characters 66
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Words

Value Count Frequency (%)
and 95113 6.0%
for 64180 4.0%
agents 63246 4.0%
other 48816 3.1%
use 44238 2.8%
nan 42957 2.7%
systemic 39487 2.5%
system 37181 2.3%
antineoplastic 28568 1.8%
drugs 23906 1.5%
Other values (1525) 1099152 69.3%

Most occurring characters

Value Count Frequency (%)
(space) 965082 6.9%
A 798250 5.7%
S 789442 5.6%
T 770819 5.5%
I 729534 5.2%
E 668561 4.8%
N 634913 4.5%
i 558618 4.0%
O 550956 3.9%
e 513048 3.7%
Other values (56) 7034121 50.2%

Most occurring categories

Value Count Frequency (%)
(unknown) 14013344 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 14013344 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 14013344 100.0%

(variable name missing in source)
Text

Distinct 10192
Distinct (%) 1.6%
Missing 0
Missing (%) 0.0%
Memory size 4.7 MiB

Length

Max length 92
Median length 67
Mean length 29.770192
Min length 3

Characters and Unicode

Total characters 18509974
Distinct characters 71
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 95
Unique (%) < 0.1%

Sample

1st row Blood and lymphatic system disorders
2nd row Cardiac disorders
3rd row Congenital, familial and genetic disorders
4th row Ear and labyrinth disorders
5th row Endocrine disorders

Words

Value Count Frequency (%)
and 207179 9.5%
disorders 187334 8.6%
nec 94150 4.3%
infections 37833 1.7%
conditions 30838 1.4%
system 27865 1.3%
vascular 26193 1.2%
congenital 25614 1.2%
tissue 23314 1.1%
excl 21037 1.0%
Other values (5908) 1507194 68.9%

Most occurring characters

Value Count Frequency (%)
i 1568338 8.5%
(space) 1566789 8.5%
e 1547802 8.4%
s 1509049 8.2%
a 1399576 7.6%
n 1293112 7.0%
r 1278261 6.9%
o 1224436 6.6%
t 1056465 5.7%
d 925120 5.0%
Other values (61) 5141026 27.8%

Most occurring categories

Value Count Frequency (%)
(unknown) 18509974 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 18509974 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 18509974 100.0%

nichd
Categorical

High correlation 

Distinct 7
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.7 MiB
toddler
99552 
early_childhood
98782 
term_neonatal
93902 
infancy
92464 
middle_childhood
86523 
Other values (2)
150539 

Length

Max length 17
Median length 16
Mean length 12.735064
Min length 7

Characters and Unicode

Total characters 7918179
Distinct characters 16
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row early_adolescence
2nd row early_adolescence
3rd row early_adolescence
4th row early_adolescence
5th row early_adolescence

Common Values

Value Count Frequency (%)
toddler 99552 16.0%
early_childhood 98782 15.9%
term_neonatal 93902 15.1%
infancy 92464 14.9%
middle_childhood 86523 13.9%
early_adolescence 78619 12.6%
late_adolescence 71920 11.6%

Length

Histogram of lengths of the category

Common Values (Plot)


Most occurring characters

Value Count Frequency (%)
e 1074817 13.6%
d 893299 11.3%
l 865142 10.9%
o 714603 9.0%
a 680128 8.6%
c 578847 7.3%
n 523271 6.6%
_ 429746 5.4%
r 370855 4.7%
h 370610 4.7%
Other values (6) 1416861 17.9%

Most occurring categories

Value Count Frequency (%)
(unknown) 7918179 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 7918179 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 7918179 100.0%

atc_concept_class_id
Categorical

High correlation 

Distinct 6
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.7 MiB
ATC4
142592 
ATC5
126599 
ATC3
123600 
ATC2
108366 
ATC1
77648 

Length

Max length 4
Median length 4
Mean length 3.9309109
Min length 3

Characters and Unicode

Total characters 2444091
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Common Values

Value Count Frequency (%)
ATC4 142592 22.9%
ATC5 126599 20.4%
ATC3 123600 19.9%
ATC2 108366 17.4%
ATC1 77648 12.5%
nan 42957 6.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value Count Frequency (%)
atc4 142592 22.9%
atc5 126599 20.4%
atc3 123600 19.9%
atc2 108366 17.4%
atc1 77648 12.5%
nan 42957 6.9%

Most occurring characters

Value Count Frequency (%)
A 578805 23.7%
T 578805 23.7%
C 578805 23.7%
4 142592 5.8%
5 126599 5.2%
3 123600 5.1%
2 108366 4.4%
n 85914 3.5%
1 77648 3.2%
a 42957 1.8%

Most occurring categories

Value Count Frequency (%)
(unknown) 2444091 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 2444091 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 2444091 100.0%

meddra_concept_class_id
Categorical

High correlation 

Distinct 9
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.7 MiB
HLT
212999 
HLGT
161416 
PT
155332 
SOC
82577 
ATC5
5432
Other values (4)
4006

Length

Max length 4
Median length 3
Mean length 3.0249645
Min length 2

Characters and Unicode

Total characters 1880808
Distinct characters 14
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row SOC
2nd row SOC
3rd row SOC
4th row SOC
5th row SOC

Common Values

Value Count Frequency (%)
HLT 212999 34.3%
HLGT 161416 26.0%
PT 155332 25.0%
SOC 82577 13.3%
ATC5 5432 0.9%
ATC4 2268 0.4%
ATC3 1110 0.2%
ATC2 530 0.1%
ATC1 98 < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value Count Frequency (%)
hlt 212999 34.3%
hlgt 161416 26.0%
pt 155332 25.0%
soc 82577 13.3%
atc5 5432 0.9%
atc4 2268 0.4%
atc3 1110 0.2%
atc2 530 0.1%
atc1 98 < 0.1%

Most occurring characters

Value Count Frequency (%)
T 539185 28.7%
H 374415 19.9%
L 374415 19.9%
G 161416 8.6%
P 155332 8.3%
C 92015 4.9%
S 82577 4.4%
O 82577 4.4%
A 9438 0.5%
5 5432 0.3%
Other values (4) 4006 0.2%

Most occurring categories

Value Count Frequency (%)
(unknown) 1880808 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 1880808 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 1880808 100.0%

a
Real number (ℝ)

High correlation  Skewed 

Distinct 376
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2.1828208
Minimum 1
Maximum 1278
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 1
median 1
Q3 2
95-th percentile 5
Maximum 1278
Range 1277
Interquartile range (IQR) 1

Descriptive statistics

Standard deviation 10.96342
Coefficient of variation (CV) 5.0225928
Kurtosis 3383.6455
Mean 2.1828208
Median Absolute Deviation (MAD) 0
Skewness 48.919349
Sum 1357195
Variance 120.19657
Monotonicity Not monotonic
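
Several of the descriptive statistics above are simple functions of one another. The sketch below rechecks them from the values printed for `a`; the tolerances allow for the rounding in the report, and the variable names are illustrative only.

```python
import math

# Values copied from the statistics for variable `a` above.
n_observations = 621_762
mean = 2.1828208
std = 10.96342
variance = 120.19657
cv = 5.0225928
total = 1_357_195
minimum, maximum = 1, 1278
q1, q3 = 1, 2

# Coefficient of variation is the standard deviation divided by the mean.
assert math.isclose(std / mean, cv, rel_tol=1e-5)
# Variance is the square of the standard deviation.
assert math.isclose(std ** 2, variance, rel_tol=1e-5)
# The sum is the mean times the number of (non-missing) observations.
assert abs(mean * n_observations - total) < 1

# Range and IQR are differences of order statistics.
value_range = maximum - minimum
iqr = q3 - q1
print(value_range, iqr)
```

The same relationships hold for the other numeric variables, with the observation count reduced by the number of missing values for odds_ratio and upr.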
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
1 465118 74.8%
2 79067 12.7%
3 27516 4.4%
4 13993 2.3%
5 7861 1.3%
6 5364 0.9%
7 3693 0.6%
8 2800 0.5%
9 2225 0.4%
10 1672 0.3%
Other values (366) 12453 2.0%

Minimum 10 values

Value Count Frequency (%)
1 465118 74.8%
2 79067 12.7%
3 27516 4.4%
4 13993 2.3%
5 7861 1.3%
6 5364 0.9%
7 3693 0.6%
8 2800 0.5%
9 2225 0.4%
10 1672 0.3%

Maximum 10 values

Value Count Frequency (%)
1278 1 < 0.1%
1256 1 < 0.1%
1117 1 < 0.1%
1025 1 < 0.1%
1000 2 < 0.1%
996 1 < 0.1%
987 1 < 0.1%
975 1 < 0.1%
961 1 < 0.1%
924 1 < 0.1%

b
Real number (ℝ)

High correlation 

Distinct 988
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5858.8967
Minimum 3503
Maximum 6792
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 3503
5-th percentile 4283
Q1 5373
median 6267
Q3 6521
95-th percentile 6792
Maximum 6792
Range 3289
Interquartile range (IQR) 1148

Descriptive statistics

Standard deviation 856.28301
Coefficient of variation (CV) 0.1461509
Kurtosis -1.0372225
Mean 5858.8967
Median Absolute Deviation (MAD) 524
Skewness -0.67791726
Sum 3.6428394 × 10^9
Variance 733220.59
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
6521 73479 11.8%
6792 73103 11.8%
6267 68719 11.1%
6332 67381 10.8%
5376 65952 10.6%
4784 60853 9.8%
4283 55634 8.9%
6791 13178 2.1%
6266 12787 2.1%
6520 12555 2.0%
Other values (978) 118121 19.0%

Minimum 10 values

Value Count Frequency (%)
3503 1 < 0.1%
3654 1 < 0.1%
3714 1 < 0.1%
3768 1 < 0.1%
3783 1 < 0.1%
3795 1 < 0.1%
3808 1 < 0.1%
3815 1 < 0.1%
3843 1 < 0.1%
3844 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
6792 73103 11.8%
6791 13178 2.1%
6790 4563 0.7%
6789 2431 0.4%
6788 1356 0.2%
6787 918 0.1%
6786 649 0.1%
6785 496 0.1%
6784 366 0.1%
6783 304 < 0.1%

c
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct 607
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2.9494839
Minimum 0
Maximum 2433
Zeros 383839
Zeros (%) 61.7%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 1
95-th percentile 10
Maximum 2433
Range 2433
Interquartile range (IQR) 1

Descriptive statistics

Standard deviation 25.291387
Coefficient of variation (CV) 8.5748517
Kurtosis 2944.789
Mean 2.9494839
Median Absolute Deviation (MAD) 0
Skewness 46.180789
Sum 1833877
Variance 639.65425
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
0 383839 61.7%
1 86677 13.9%
2 42626 6.9%
3 24844 4.0%
4 16263 2.6%
5 11178 1.8%
6 8166 1.3%
7 6158 1.0%
8 4874 0.8%
9 3983 0.6%
Other values (597) 33154 5.3%

Minimum 10 values

Value Count Frequency (%)
0 383839 61.7%
1 86677 13.9%
2 42626 6.9%
3 24844 4.0%
4 16263 2.6%
5 11178 1.8%
6 8166 1.3%
7 6158 1.0%
8 4874 0.8%
9 3983 0.6%

Maximum 10 values

Value Count Frequency (%)
2433 1 < 0.1%
2400 1 < 0.1%
2365 1 < 0.1%
2329 1 < 0.1%
2324 1 < 0.1%
2256 1 < 0.1%
2229 1 < 0.1%
2214 1 < 0.1%
2180 1 < 0.1%
2138 1 < 0.1%

d
Real number (ℝ)

High correlation 

Distinct 2283
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 12110.258
Minimum 10675
Maximum 15154
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 10675
5-th percentile 11147
Q1 11340
median 11662
Q3 13017
95-th percentile 13472
Maximum 15154
Range 4479
Interquartile range (IQR) 1677

Descriptive statistics

Standard deviation 905.0035
Coefficient of variation (CV) 0.074730322
Kurtosis 0.0020492053
Mean 12110.258
Median Absolute Deviation (MAD) 514
Skewness 0.8714147
Sum 7.5296984 × 10^9
Variance 819031.34
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
11148 60824 9.8%
11340 60011 9.7%
11662 58852 9.5%
11577 55940 9.0%
12433 50641 8.1%
13019 46196 7.4%
13472 42456 6.8%
11147 13781 2.2%
11339 13318 2.1%
11576 12622 2.0%
Other values (2273) 207121 33.3%

Minimum 10 values

Value Count Frequency (%)
10675 1 < 0.1%
10687 1 < 0.1%
10770 1 < 0.1%
10843 1 < 0.1%
10900 1 < 0.1%
10901 1 < 0.1%
10902 1 < 0.1%
10913 1 < 0.1%
10917 1 < 0.1%
10925 2 < 0.1%

Maximum 10 values

Value Count Frequency (%)
15154 154 < 0.1%
15153 1395 0.2%
15152 714 0.1%
15151 525 0.1%
15150 334 0.1%
15149 240 < 0.1%
15148 194 < 0.1%
15147 139 < 0.1%
15146 131 < 0.1%
15145 122 < 0.1%

lwr
Real number (ℝ)

High correlation 

Distinct 15006
Distinct (%) 2.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.13981449
Minimum 0.00134153
Maximum 21.657279
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 0.00134153
5-th percentile 0.013911187
Q1 0.042079303
median 0.047706522
Q3 0.080633861
95-th percentile 0.59066964
Maximum 21.657279
Range 21.655937
Interquartile range (IQR) 0.038554557

Descriptive statistics

Standard deviation 0.27049641
Coefficient of variation (CV) 1.9346809
Kurtosis 234.72954
Mean 0.13981449
Median Absolute Deviation (MAD) 0.02205731
Skewness 8.3332671
Sum 86931.334
Variance 0.073168307
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
0.04207930329 53570 8.6%
0.04458263212 53380 8.6%
0.04770652161 51138 8.2%
0.04687272307 48970 7.9%
0.05928846718 45906 7.4%
0.06976383139 42109 6.8%
0.08063386062 38623 6.2%
0.02090460533 9949 1.6%
0.02214827468 9732 1.6%
0.02328601866 9260 1.5%
Other values (14996) 259125 41.7%

Minimum 10 values

Value Count Frequency (%)
0.001341529984 2 < 0.1%
0.001377042144 1 < 0.1%
0.001386303225 1 < 0.1%
0.001416783902 1 < 0.1%
0.001421115887 1 < 0.1%
0.001434031858 1 < 0.1%
0.001520387968 1 < 0.1%
0.001530173917 1 < 0.1%
0.001540246981 1 < 0.1%
0.001567899446 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
21.65727862 2 < 0.1%
13.87425078 2 < 0.1%
11.31111151 1 < 0.1%
10.72333579 1 < 0.1%
9.782325697 4 < 0.1%
9.057418938 1 < 0.1%
8.309549972 1 < 0.1%
7.689418325 1 < 0.1%
7.675235773 1 < 0.1%
7.57435195 3 < 0.1%

odds_ratio
Real number (ℝ)

High correlation  Missing 

Distinct 14807
Distinct (%) 6.2%
Missing 383839
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 1.9091598
Minimum 0.054568098
Maximum 67.490004
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 0.054568098
5-th percentile 0.36552637
Q1 0.86805772
median 1.6410821
Q3 2.3124051
95-th percentile 5.2176204
Maximum 67.490004
Range 67.435436
Interquartile range (IQR) 1.4443474

Descriptive statistics

Standard deviation 1.7351204
Coefficient of variation (CV) 0.90883979
Kurtosis 39.608329
Mean 1.9091598
Median Absolute Deviation (MAD) 0.77174846
Skewness 3.9942356
Sum 454233.03
Variance 3.0106429
Monotonicity Not monotonic
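
The report does not say how odds_ratio was computed. If a, b, c, and d are the four cells of a 2×2 contingency table (an assumption based only on the column names), the sample odds ratio would be (a·d)/(b·c), which is undefined when c is 0. That reading would be consistent with the matching counts in the alerts: c has 383839 zeros and odds_ratio has 383839 missing values. A minimal sketch, with a hypothetical helper name:

```python
from typing import Optional


def sample_odds_ratio(a: int, b: int, c: int, d: int) -> Optional[float]:
    """Sample odds ratio (a*d)/(b*c); None when the denominator is zero.

    Assumes a, b, c, d are the cells of a 2x2 table. The report does not
    confirm this layout or the estimator actually used.
    """
    denominator = b * c
    if denominator == 0:
        return None
    return (a * d) / denominator


# Illustrative counts, not rows taken from the dataset.
print(sample_odds_ratio(2, 100, 1, 200))  # 4.0
print(sample_odds_ratio(1, 100, 0, 200))  # None
```

Whether the source pipeline used this plain ratio or a different estimator cannot be determined from the report alone.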
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
1.64108209 9949 1.6%
1.738701245 9732 1.6%
1.828006996 9260 1.5%
2.312405113 8781 1.4%
1.860701112 8536 1.4%
2.720975481 7659 1.2%
3.14502416 6616 1.1%
1.156131878 3757 0.6%
0.8693336331 3677 0.6%
0.8205592596 3528 0.6%
Other values (14797) 166428 26.8%
(Missing) 383839 61.7%

Minimum 10 values

Value Count Frequency (%)
0.05456809791 2 < 0.1%
0.05533288192 1 < 0.1%
0.05623415523 1 < 0.1%
0.05733603857 1 < 0.1%
0.05781733035 1 < 0.1%
0.05847617725 1 < 0.1%
0.06164440307 1 < 0.1%
0.0617468172 1 < 0.1%
0.06298545453 1 < 0.1%
0.06374354861 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
67.49000404 1 < 0.1%
42.55681534 1 < 0.1%
42.14805892 1 < 0.1%
39.32585759 1 < 0.1%
39.00186398 1 < 0.1%
38.3726715 4 < 0.1%
34.06995123 1 < 0.1%
33.75510211 1 < 0.1%
32.72349883 1 < 0.1%
31.51351638 2 < 0.1%

upr
Real number (ℝ)

High correlation  Missing 

Distinct 14807
Distinct (%) 6.2%
Missing 383839
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 80.693309
Minimum 0.25019618
Maximum 2762.9908
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 0.25019618
5-th percentile 1.5377134
Q1 5.1828331
median 17.880295
Q3 143.35945
95-th percentile 272.46588
Maximum 2762.9908
Range 2762.7406
Interquartile range (IQR) 138.17662

Descriptive statistics

Standard deviation 105.87483
Coefficient of variation (CV) 1.3120645
Kurtosis 9.0860545
Mean 80.693309
Median Absolute Deviation (MAD) 15.583414
Skewness 1.9677921
Sum 19198794
Variance 11209.479
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
128.7133343 9949 1.6%
136.3626593 9732 1.6%
143.3594517 9260 1.5%
181.279412 8781 1.4%
145.9067123 8536 1.4%
213.2548254 7659 1.2%
246.4172615 6616 1.1%
22.21640539 3757 0.6%
16.70275849 3677 0.6%
15.76484814 3528 0.6%
Other values (14797) 166428 26.8%
(Missing) 383839 61.7%

Minimum 10 values

Value Count Frequency (%)
0.2501961769 1 < 0.1%
0.2751686898 1 < 0.1%
0.2920582145 1 < 0.1%
0.3075935421 1 < 0.1%
0.3083758884 1 < 0.1%
0.311709364 1 < 0.1%
0.3151973965 1 < 0.1%
0.3178243271 1 < 0.1%
0.3213858444 1 < 0.1%
0.321828071 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
2762.9908 1 < 0.1%
1802.461975 1 < 0.1%
1732.307922 1 < 0.1%
1664.52855 1 < 0.1%
1642.197459 1 < 0.1%
1571.938528 4 < 0.1%
1442.508241 1 < 0.1%
1437.77005 1 < 0.1%
1389.395936 1 < 0.1%
1358.36701 2 < 0.1%

pvalue
Real number (ℝ)

High correlation 

Distinct 13942
Distinct (%) 2.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.43273837
Minimum 5.8866777 × 10^-87
Maximum 1
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 5.8866777 × 10^-87
5-th percentile 0.072220807
Q1 0.26875983
median 0.35360134
Q3 0.46529885
95-th percentile 1
Maximum 1
Range 1
Interquartile range (IQR) 0.19653902

Descriptive statistics

Standard deviation 0.28001915
Coefficient of variation (CV) 0.64708649
Kurtosis 0.078877221
Mean 0.43273837
Median Absolute Deviation (MAD) 0.084841511
Skewness 1.0632428
Sum 269060.27
Variance 0.078410725
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
1 98602 15.9%
0.3786299537 53570 8.6%
0.3651326839 53380 8.6%
0.3495817066 51138 8.2%
0.35360134 48970 7.9%
0.3019090399 45906 7.4%
0.2687598293 42109 6.8%
0.2412705564 38623 6.2%
0.5126808458 8781 1.4%
0.4652988517 7659 1.2%
Other values (13932) 173024 27.8%

Minimum 10 values

Value Count Frequency (%)
5.886677744 × 10^-87 1 < 0.1%
3.108247758 × 10^-51 1 < 0.1%
9.940723947 × 10^-40 1 < 0.1%
8.709728709 × 10^-29 1 < 0.1%
4.53904082 × 10^-26 1 < 0.1%
3.485339269 × 10^-25 1 < 0.1%
2.514910737 × 10^-22 1 < 0.1%
3.691974821 × 10^-21 1 < 0.1%
1.039725295 × 10^-20 1 < 0.1%
6.220832436 × 10^-20 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
1 98602 15.9%
0.9673017561 1 < 0.1%
0.9609244748 1 < 0.1%
0.957249295 1 < 0.1%
0.9520740118 1 < 0.1%
0.9489849933 1 < 0.1%
0.9477859839 1 < 0.1%
0.9466495942 1 < 0.1%
0.9453009213 1 < 0.1%
0.9448824863 1 < 0.1%

fdr
Real number (ℝ)

High correlation 

Distinct 4302
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.75951294
Minimum 6.373506 × 10^-83
Maximum 1
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 4.7 MiB

Quantile statistics

Minimum 6.373506 × 10^-83
5-th percentile 0.69389984
Q1 0.69389984
median 0.69389984
Q3 0.78384793
95-th percentile 1
Maximum 1
Range 1
Interquartile range (IQR) 0.08994809

Descriptive statistics

Standard deviation 0.13567836
Coefficient of variation (CV) 0.17863864
Kurtosis 3.1759362
Mean 0.75951294
Median Absolute Deviation (MAD) 0
Skewness 0.15154829
Sum 472236.28
Variance 0.018408616
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
0.6938998431 434281 69.8%
1 110645 17.8%
0.8170273935 10111 1.6%
0.7838479333 8608 1.4%
0.751372918 7286 1.2%
0.8719330968 5806 0.9%
0.9755106201 2461 0.4%
0.9297289142 2204 0.4%
0.7568148028 2045 0.3%
0.9486363334 1851 0.3%
Other values (4292) 36464 5.9%

Minimum 10 values

Value Count Frequency (%)
6.373505994 × 10^-83 1 < 0.1%
2.234559099 × 10^-47 1 < 0.1%
5.689899432 × 10^-36 1 < 0.1%
3.143341091 × 10^-25 1 < 0.1%
1.411407675 × 10^-22 1 < 0.1%
1.040554407 × 10^-21 1 < 0.1%
6.384186158 × 10^-19 1 < 0.1%
8.507076782 × 10^-18 1 < 0.1%
2.318460989 × 10^-17 1 < 0.1%
1.312275841 × 10^-16 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
1 110645 17.8%
0.9999064712 12 < 0.1%
0.9998377323 1 < 0.1%
0.999628494 1 < 0.1%
0.9987463441 12 < 0.1%
0.9986199302 2 < 0.1%
0.9984328232 1 < 0.1%
0.9979517332 1 < 0.1%
0.9973352938 1 < 0.1%
0.9973306005 1 < 0.1%

Interactions

[Figures: 81 interaction plots rendered as SVG with Matplotlib v3.9.4; images not preserved.]

Correlations

[Figure: correlation heatmap; matrix values follow.]
a b c d lwr odds_ratio upr pvalue fdr
a 1.000 -0.002 0.940 0.042 0.235 -0.007 -0.084 -0.047 -0.073
b -0.002 1.000 -0.029 -0.899 -0.051 -0.126 -0.100 0.182 0.079
c 0.940 -0.029 1.000 0.071 0.137 -0.089 -0.122 -0.004 -0.029
d 0.042 -0.899 0.071 1.000 0.132 0.111 0.035 -0.159 -0.058
lwr 0.235 -0.051 0.137 0.132 1.000 0.620 0.139 -0.345 -0.350
odds_ratio -0.007 -0.126 -0.089 0.111 0.620 1.000 0.797 -0.501 -0.520
upr -0.084 -0.100 -0.122 0.035 0.139 0.797 1.000 -0.261 -0.275
pvalue -0.047 0.182 -0.004 -0.159 -0.345 -0.501 -0.261 1.000 0.900
fdr -0.073 0.079 -0.029 -0.058 -0.350 -0.520 -0.275 0.900 1.000
[Figure: correlation heatmap; matrix values follow.]
a b c d lwr odds_ratio upr pvalue fdr
a 1.000 -0.078 0.500 0.009 0.724 0.156 -0.345 -0.239 0.040
b -0.078 1.000 -0.078 -0.924 -0.375 -0.150 -0.032 0.376 0.031
c 0.500 -0.078 1.000 0.009 -0.069 -0.677 -0.957 0.453 0.703
d 0.009 -0.924 0.009 1.000 0.338 0.168 0.083 -0.390 -0.088
lwr 0.724 -0.375 -0.069 0.338 1.000 0.548 0.068 -0.675 -0.405
odds_ratio 0.156 -0.150 -0.677 0.168 0.548 1.000 0.836 -0.397 -0.420
upr -0.345 -0.032 -0.957 0.083 0.068 0.836 1.000 -0.063 -0.111
pvalue -0.239 0.376 0.453 -0.390 -0.675 -0.397 -0.063 1.000 0.811
fdr 0.040 0.031 0.703 -0.088 -0.405 -0.420 -0.111 0.811 1.000
[Figure: correlation heatmap; matrix values follow.]
a b c d lwr odds_ratio upr pvalue fdr
a 1.000 -0.072 0.456 0.006 0.599 0.122 -0.257 -0.199 0.037
b -0.072 1.000 -0.063 -0.820 -0.331 -0.128 -0.038 0.335 0.025
c 0.456 -0.063 1.000 -0.006 -0.095 -0.509 -0.844 0.358 0.629
d 0.006 -0.820 -0.006 1.000 0.311 0.135 0.090 -0.346 -0.070
lwr 0.599 -0.331 -0.095 0.311 1.000 0.462 0.142 -0.658 -0.335
odds_ratio 0.122 -0.128 -0.509 0.135 0.462 1.000 0.680 -0.317 -0.345
upr -0.257 -0.038 -0.844 0.090 0.142 0.680 1.000 -0.060 -0.103
pvalue -0.199 0.335 0.358 -0.346 -0.658 -0.317 -0.060 1.000 0.717
fdr 0.037 0.025 0.629 -0.070 -0.335 -0.345 -0.103 0.717 1.000
[Figure: correlation heatmap; matrix values follow.]
category nichd atc_concept_class_id meddra_concept_class_id a b c d lwr odds_ratio upr pvalue fdr
category 1.000 0.050 1.000 1.000 0.609 0.483 0.651 0.638 0.066 0.066 0.118 0.474 0.411
nichd 0.050 1.000 0.025 0.025 0.000 0.928 0.000 0.917 0.015 0.036 0.086 0.542 0.240
atc_concept_class_id 1.000 0.025 1.000 0.494 0.077 0.062 0.084 0.607 0.043 0.035 0.048 0.225 0.183
meddra_concept_class_id 1.000 0.025 0.494 1.000 0.359 0.235 0.405 0.335 0.040 0.034 0.063 0.295 0.257
a 0.609 0.000 0.077 0.359 1.000 0.752 0.904 0.135 0.000 0.000 0.000 0.063 0.142
b 0.483 0.928 0.062 0.235 0.752 1.000 0.750 0.956 0.017 0.034 0.089 0.657 0.325
c 0.651 0.000 0.084 0.405 0.904 0.750 1.000 0.125 0.000 0.000 0.000 0.071 0.153
d 0.638 0.917 0.607 0.335 0.135 0.956 0.125 1.000 0.042 0.062 0.114 0.673 0.358
lwr 0.066 0.015 0.043 0.040 0.000 0.017 0.000 0.042 1.000 0.781 0.745 0.137 0.463
odds_ratio 0.066 0.036 0.035 0.034 0.000 0.034 0.000 0.062 0.781 1.000 0.984 0.370 0.369
upr 0.118 0.086 0.048 0.063 0.000 0.089 0.000 0.114 0.745 0.984 1.000 0.385 0.368
pvalue 0.474 0.542 0.225 0.295 0.063 0.657 0.071 0.673 0.137 0.370 0.385 1.000 0.933
fdr 0.411 0.240 0.183 0.257 0.142 0.325 0.153 0.358 0.463 0.369 0.368 0.933 1.000
[Figure: correlation heatmap; matrix values follow.]
atc_concept_class_id category meddra_concept_class_id nichd
atc_concept_class_id 1.000 1.000 0.271 0.015
category 1.000 1.000 1.000 0.019
meddra_concept_class_id 0.271 1.000 1.000 0.013
nichd 0.015 0.019 0.013 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
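One common way to compute a nullity correlation like the one described above is to turn each column into a 0/1 "is missing" indicator and take the Pearson correlation of the indicators. The sketch below assumes that definition; the helper names and the toy columns are hypothetical and are not taken from the report's implementation.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_correlation(col_x, col_y):
    """Pearson correlation between the missing-value indicators of two columns."""
    x = [1.0 if is_missing(v) else 0.0 for v in col_x]
    y = [1.0 if is_missing(v) else 0.0 for v in col_y]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A column that is never (or always) missing has no defined correlation.
        return math.nan
    return cov / math.sqrt(var_x * var_y)


# Toy columns (hypothetical values) that are missing on exactly the same rows.
odds_ratio = [1.02, None, None, 3.28]
upr = [1.19, None, None, 193.47]
r = nullity_correlation(odds_ratio, upr)  # 1.0: always missing together
```

A value near 1 means the two columns tend to be missing on the same rows, and a value near -1 means one is present exactly when the other is absent. In this dataset odds_ratio and upr report the same missing count (383839), which would be consistent with a strong positive nullity correlation if the same rows are affected.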

Sample

category atc_concept_name meddra_concept_name nichd atc_concept_class_id meddra_concept_class_id a b c d lwr odds_ratio upr pvalue fdr
0 soc nan Blood and lymphatic system disorders early_adolescence nan SOC 217 4568 653 14000 0.866220 1.018470 1.193877 8.092209e-01 1.000000e+00
1 soc nan Cardiac disorders early_adolescence nan SOC 178 4607 896 13757 0.500478 0.593221 0.700061 7.467369e-11 6.264063e-08
2 soc nan Congenital, familial and genetic disorders early_adolescence nan SOC 167 4618 1113 13540 0.370352 0.439948 0.519963 4.539041e-26 1.411408e-22
3 soc nan Ear and labyrinth disorders early_adolescence nan SOC 30 4755 157 14496 0.379877 0.582571 0.866441 6.130941e-03 3.923877e-01
4 soc nan Endocrine disorders early_adolescence nan SOC 148 4637 335 14318 1.113272 1.364126 1.665167 2.276053e-03 2.024703e-01
5 soc nan Eye disorders early_adolescence nan SOC 241 4544 703 13950 0.901897 1.052442 1.224866 5.102971e-01 8.170274e-01
6 soc nan Gastrointestinal disorders early_adolescence nan SOC 423 4362 1141 13512 1.019414 1.148358 1.291900 2.166934e-02 6.938998e-01
7 soc nan General disorders and administration site conditions early_adolescence nan SOC 358 4427 1120 13533 0.860940 0.977114 1.107072 7.297870e-01 1.000000e+00
8 soc nan Hepatobiliary disorders early_adolescence nan SOC 155 4630 377 14276 1.041479 1.267684 1.537119 1.637289e-02 6.938998e-01
9 soc nan Immune system disorders early_adolescence nan SOC 224 4561 621 14032 0.944744 1.109734 1.299903 1.915546e-01 6.938998e-01
category atc_concept_name meddra_concept_name nichd atc_concept_class_id meddra_concept_class_id a b c d lwr odds_ratio upr pvalue fdr
621752 pt_atc4 Xanthines Wrong patient received medication toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621753 pt_atc4 Other antineoplastic agents Wrong technique in product usage process toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621754 pt_atc4 Second-generation cephalosporins X-ray abnormal toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621755 pt_atc4 Progesterone receptor modulators X-ray limb abnormal toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621756 pt_atc4 Other drugs affecting bone structure and mineralization Xanthogranuloma toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621757 pt_atc4 Interleukin inhibitors Xerophthalmia toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621758 pt_atc4 Selective immunosuppressants Xerophthalmia toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621759 pt_atc4 Protein kinase inhibitors Xerosis toddler ATC4 PT 2 6791 1 11147 0.170858 3.282652 193.470808 0.561398 0.871933
621760 pt_atc4 Corticosteroids, potent (group III) Yawning toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
621761 pt_atc4 Vitamin D and analogues Zinc deficiency toddler ATC4 PT 1 6792 0 11148 0.042079 NaN NaN 0.378630 0.693900
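The report does not define the count columns, but the sample rows are consistent with a, b, c and d being the four cells of a 2x2 contingency table and odds_ratio being close to the cross-product ratio a*d / (b*c). The sketch below works under that assumption; the function name is hypothetical, and the report's values may come from a different estimator (for example a conditional maximum-likelihood estimate), so only approximate agreement should be expected.

```python
import math


def cross_product_odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c) for a 2x2 table.

    Assumption for this sketch: a, b, c, d are the table cells in the
    order used by the report's columns. Returns NaN when the denominator
    is zero, since the ratio is undefined there.
    """
    if b == 0 or c == 0:
        return math.nan
    return (a * d) / (b * c)


# Row 0 of the sample: a=217, b=4568, c=653, d=14000 (reported 1.018470).
row0 = cross_product_odds_ratio(217, 4568, 653, 14000)

# Row 621752 of the sample has c=0, where the report shows NaN.
row_c_zero = cross_product_odds_ratio(1, 6792, 0, 11148)
```

Under this reading, the rows with c = 0 are the ones where odds_ratio and upr are NaN, which fits the overview figures: c has 383839 zeros and odds_ratio and upr each have 383839 missing values.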